*************
*Playing to the Gallery: Emotive Rhetoric in Parliaments
*Moritz Osnabruegge, Sara B. Hobolt, Toni Rodon
*Analysis of parliamentary speeches held in the UK House of Commons
*************


clear


*Specify the path here
cd "path"


import delimited "uk_data.csv", encoding(UTF-8) clear 
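
*Optional setup sketch (not part of the original replication code): sutex
*(Table A2) and mat2txt (Figure 3) are user-written commands rather than
*built-in Stata commands. Assuming both are still distributed via SSC, the
*lines below install them only if they are not already on the ado-path.
capture which sutex
if _rc ssc install sutex
capture which mat2txt
if _rc ssc install mat2txt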


**Table 1: Data on speeches in the House of Commons
table period /*All speeches*/
table pm_questions period /*PMQs*/
table queen_debate_day1 period /*Queen's Speech debate: Opening day*/
table queen_debate_others period /*Queen's Speech debate: Other days*/
table m_questions period /*MQs*/
table u_questions period /*Urgent Questions*/


**Table A2: Summary statistics: House of Commons debates
sutex emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle linear_trend, minmax


**Table 3 in the manuscript and Table A4: OLS regression analysis of emotive rhetoric

*Model 1: OLS baseline
reg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed effects and weighting by speech length
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)

		
		
**Figure 3: Predicted level of emotive rhetoric by type of debate with 95% confidence intervals

areg emotive_rhetoric i.pm_questions i.queen_debate_day1 i.queen_debate_others i.m_questions i.u_questions i.leader i.prime_minister i.senior_minister i.shadow i.cabinet i.chair i.government i.female age electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Predicted values for each debate type; r(table) holds the estimate, standard error and confidence bounds
margins, at(pm_questions=1 queen_debate_day1=0 queen_debate_others=0 m_questions=0 u_questions=0)
matrix r1 = r(table)
mat2txt, mat(r1) saving("pm_questions.txt") replace

margins, at(pm_questions=0 queen_debate_day1=1 queen_debate_others=0 m_questions=0 u_questions=0)
matrix r2 = r(table)
mat2txt, mat(r2) saving("queen_debate_day1.txt") replace

margins, at(pm_questions=0 queen_debate_day1=0 queen_debate_others=1 m_questions=0 u_questions=0)
matrix r3 = r(table)
mat2txt, mat(r3) saving("queen_debate_others.txt") replace

margins, at(pm_questions=0 queen_debate_day1=0 queen_debate_others=0 m_questions=1 u_questions=0)
matrix r4 = r(table)
mat2txt, mat(r4) saving("m_questions.txt") replace

margins, at(pm_questions=0 queen_debate_day1=0 queen_debate_others=0 m_questions=0 u_questions=1)
matrix r5 = r(table)
mat2txt, mat(r5) saving("u_questions.txt") replace

*All debate-type indicators set to 0: all other debates
margins, at(pm_questions=0 queen_debate_day1=0 queen_debate_others=0 m_questions=0 u_questions=0)
matrix r6 = r(table)
mat2txt, mat(r6) saving("others.txt") replace
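
*Optional sketch (not part of the original replication code): the saved
*matrices store the prediction in row "b" and the confidence bounds in rows
*"ll" and "ul". The lines below read these rows from r6 (all other debates)
*and display them; the local names b, ll and ul are illustrative only.
local b  = r6[rownumb(r6, "b"), 1]
local ll = r6[rownumb(r6, "ll"), 1]
local ul = r6[rownumb(r6, "ul"), 1]
display "Other debates: " %6.3f `b' " [" %6.3f `ll' ", " %6.3f `ul' "]"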



**Tables 4 and A5: OLS regression analysis of emotive rhetoric with topic fixed effects

*Generate numeric variables 
encode party, gen(party2)
encode top_topic, gen(top_topic2)
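
*Optional check (not part of the original replication code): encode numbers
*the categories alphabetically, so listing the value labels shows which party
*is the base category selected by ib2.party2 in Model 4 below.
label list party2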

*Model 1: OLS baseline with topic fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp) a(top_topic)

*Model 2: OLS with time trend and topic fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(top_topic)

*Model 3: OLS with time trend, MP fixed-effects and topic fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend i.top_topic2, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects, controls and topic fixed-effects
areg emotive_rhetoric  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet government female age chair electoral_cycle linear_trend ib2.party2, vce(cluster id_mp) a(top_topic)

*Model 5:  OLS with time trend, MP fixed-effects, weighting by speech length and topic fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend i.top_topic2 [aw=words], vce(cluster id_mp) a(id_mp)



**Table A7: OLS regression analysis of emotive rhetoric (only emotive words)

*Model 1: OLS baseline
reg emotive_words pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_words pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_words pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_words pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_words pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)



**Table A8: OLS regression analysis of emotive rhetoric (using Lowe et al. scaling procedure)

*Model 1: OLS baseline
reg emotive_rhetoric_log pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric_log pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric_log pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric_log  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government  female age electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_rhetoric_log pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)
	


**Table A9: OLS regression analysis of emotive rhetoric (LIWC dictionary)

*Model 1: OLS baseline
reg emotive_rhetoric_liwc pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric_liwc pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric_liwc pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend , vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric_liwc pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet government female age chair electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_rhetoric_liwc pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)

	
	
**Table A10: Regression analysis of emotive rhetoric using multilevel models

*Model 1: Random intercepts at the MP level
mixed emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions || id_mp: 

*Model 2: Random intercepts at the MP and legislative period level
mixed emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle || id_mp: || period: 

*Model 3: Random intercepts at the MP and party level
mixed emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle || id_mp: || party: 

*Model 4: Random intercepts at the MP and topic level
mixed emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government female age electoral_cycle || id_mp: || top_topic:



**Table A11: Number of speeches by topic and period
table top_topic period



**Table A12: OLS regression analysis of emotive rhetoric (with a control for party-level polarization)
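*date is compared as a string (the literals below assume YYYY-MM-DD format);
*the cut-offs are the UK general election days of 2005, 2010, 2015 and 2017
gen polarization = .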
replace polarization = 4.90599 if date<"2005-05-05"
replace polarization = 5.77743 if date>"2005-05-05" & date<"2010-05-06"
replace polarization = 5.60982 if date>"2010-05-06" & date<"2015-05-07"
replace polarization = 6.39171 if date>"2015-05-07" & date<"2017-06-08"
replace polarization = 5.86028 if date>"2017-06-08"
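
*Optional check (not part of the original replication code): the strict
*inequalities above leave polarization missing for any speech dated exactly on
*one of the cut-off days, and such speeches would drop out of the models below.
count if missing(polarization)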


*Model 1: OLS baseline
reg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions polarization, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend polarization, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend polarization, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government  female age electoral_cycle linear_trend polarization, vce(cluster id_mp) a(party2)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_rhetoric pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend polarization [aw=words], vce(cluster id_mp) a(id_mp)



**Table A13: Number of Speeches by Gender and Topic
table female top_topic



**Table A14: OLS regression analysis of emotive rhetoric in speeches on selected topics
keep if top_topic=="external relations" | top_topic=="freedom and democracy" | top_topic=="political system" | top_topic=="economy"

areg emotive_rhetoric  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet chair government  female age electoral_cycle linear_trend, vce(cluster id_mp) a(party)
	
	

**Further Analysis A (downsampling parameter 0.00075, see in-text discussion in Appendix E)

*Reload the data (the keep command for Table A14 dropped observations); same file as at the top of the script
import delimited "uk_data.csv", encoding(UTF-8) clear

*Model 1: OLS baseline
reg emotive_rhetoric_a1 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric_a1 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric_a1 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric_a1  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet government female age chair electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_rhetoric_a1 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)


**Further Analysis B (15 epochs, see in-text discussion in Appendix E)

*Model 1: OLS baseline
reg emotive_rhetoric_a2 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions, vce(cluster id_mp)

*Model 2: OLS with time trend
reg emotive_rhetoric_a2 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) 

*Model 3: OLS with time trend and MP fixed-effects
areg emotive_rhetoric_a2 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend, vce(cluster id_mp) a(id_mp)

*Model 4: OLS with time trend, party fixed-effects and controls
areg emotive_rhetoric_a2  pm_questions queen_debate_day1 queen_debate_others m_questions u_questions leader prime_minister senior_minister shadow cabinet government female age chair electoral_cycle linear_trend, vce(cluster id_mp) a(party)

*Model 5: OLS with time trend, MP fixed-effects and weighting by speech length
areg emotive_rhetoric_a2 pm_questions queen_debate_day1 queen_debate_others m_questions u_questions linear_trend [aw=words], vce(cluster id_mp) a(id_mp)